Shadow Removal


MatteViT: High-Frequency-Aware Document Shadow Removal with Shadow Matte Guidance

Kim, Chaewon, Lee, Seoyeon, Park, Jonghyuk

arXiv.org Artificial Intelligence

Document shadow removal is essential for enhancing the clarity of digitized documents. Preserving high-frequency details (e.g., text edges and lines) is critical in this process because shadows often obscure or distort fine structures. This paper proposes a matte vision transformer (MatteViT), a novel shadow removal framework that applies spatial and frequency-domain information to eliminate shadows while preserving fine-grained structural details. To effectively retain these details, we employ two preservation strategies. First, our method introduces a lightweight high-frequency amplification module (HFAM) that decomposes and adaptively amplifies high-frequency components. Second, we present a continuous luminance-based shadow matte, generated using a custom-built matte dataset and shadow matte generator, which provides precise spatial guidance from the earliest processing stage. These strategies enable the model to accurately identify fine-grained regions and restore them with high fidelity. Extensive experiments on public benchmarks (RDD and Kligler) demonstrate that MatteViT achieves state-of-the-art performance, providing a robust and practical solution for real-world document shadow removal. Furthermore, the proposed method better preserves text-level details in downstream tasks, such as optical character recognition, improving recognition performance over prior methods.
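The HFAM itself is a learned module whose details are not given in this abstract. As a rough illustration of the decompose-and-amplify idea only, the sketch below splits an image into low- and high-frequency parts with a fixed box blur and scales the residual by a scalar gain; the function name, kernel, and gain are all assumptions, not MatteViT's design.

```python
import numpy as np

def amplify_high_frequency(img, gain=1.5, ksize=3):
    """Illustrative high-frequency amplification (not the paper's HFAM):
    low-pass the image with a box blur, treat the residual as the
    high-frequency component (text edges, strokes), and re-add it scaled."""
    pad = ksize // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    low = np.zeros_like(img, dtype=float)
    for dy in range(ksize):              # accumulate the box-blur window
        for dx in range(ksize):
            low += padded[dy:dy + h, dx:dx + w]
    low /= ksize * ksize
    high = img - low                     # high-frequency residual
    return low + gain * high             # amplified detail, same resolution
```

With `gain=1.0` the decomposition is exactly invertible, so the operation reduces to the identity; gains above 1 sharpen fine structure.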






Generative Status Estimation and Information Decoupling for Image Rain Removal - Supplementary Material

Lin, Di

Neural Information Processing Systems

The feature masking module contains two convolutional layers and computes the rain (or object) feature map. We use the Adam solver to optimize the parameters of SEIDNet. Performance is reported on the test set of Rain100H. We report the results in Table 1 (also see Table 1 of the main paper).
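The supplementary does not spell out the two masking layers. A minimal numpy sketch of one plausible two-layer gating head is shown below; the 1x1-convolution weights `w1`, `w2`, the ReLU/sigmoid choice, and the shapes are assumptions, not SEIDNet's actual architecture.

```python
import numpy as np

def feature_mask(feat, w1, w2):
    """Illustrative two-layer feature-masking head (not SEIDNet's exact
    layers): a 1x1 conv + ReLU, then a 1x1 conv + sigmoid producing a
    gate that selects the rain (or object) part of the feature map.
    Shapes: feat (C, H, W); w1 (Ch, C); w2 (C, Ch)."""
    h = np.maximum(np.einsum("oc,chw->ohw", w1, feat), 0.0)        # 1x1 conv + ReLU
    gate = 1.0 / (1.0 + np.exp(-np.einsum("oc,chw->ohw", w2, h)))  # sigmoid mask
    return gate * feat                                             # gated feature map
```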



Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models

Qiu, Xingyu, Yang, Mengying, Ma, Xinghua, Liang, Dong, Li, Yuzhen, Li, Fanding, Luo, Gongning, Wang, Wei, Wang, Kuanquan, Li, Shuo

arXiv.org Artificial Intelligence

EDM elucidates the unified design space of diffusion models, yet its fixed noise pattern, restricted to pure Gaussian noise, limits advancements in image restoration. Our study indicates that forcibly injecting Gaussian noise corrupts the degraded images, overextends the image transformation distance, and increases restoration complexity. To address this problem, our proposed EDA Elucidates the Design space of Arbitrary-noise-based diffusion models. Theoretically, EDA expands the freedom of the noise pattern while preserving the original module flexibility of EDM, with rigorous proof that increased noise complexity incurs no additional computational overhead during restoration. EDA is validated on three typical tasks: MRI bias field correction (global smooth noise), CT metal artifact reduction (global sharp noise), and natural image shadow removal (local boundary-aware noise). With only 5 sampling steps, EDA outperforms most task-specific methods and achieves state-of-the-art performance in bias field correction and shadow removal.
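EDA's actual formulation is not given in the abstract. The sketch below only illustrates the general idea of diffusing toward an arbitrary "noise" endpoint (which could be the degraded image itself) and reversing it in a few steps; the plain linear schedule and the oracle `denoiser` interface are assumptions for illustration.

```python
import numpy as np

def forward_interpolate(x0, noise, t):
    """Blend the clean image toward an arbitrary noise endpoint at time t
    in [0, 1]. A linear schedule is assumed here, not EDA's formulation."""
    return (1.0 - t) * x0 + t * noise

def sample(noise, denoiser, steps=5):
    """Few-step reverse pass: start at the noise endpoint, ask the
    denoiser for its clean-image estimate, and re-blend at the next
    (smaller) time until t reaches 0."""
    x = noise.copy()
    for i in range(steps, 0, -1):
        t, t_prev = i / steps, (i - 1) / steps
        x0_hat = denoiser(x, t)                         # predicted clean image
        x = forward_interpolate(x0_hat, noise, t_prev)  # step toward t_prev
    return x
```

With a perfect denoiser this recovers the clean image exactly at t = 0, which is why only a handful of steps can suffice when the endpoint is already close to the target.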


DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal

Liu, Wenjie, Wang, Bingshu, Wang, Ze, Chen, C. L. Philip

arXiv.org Artificial Intelligence

Document shadow removal is a crucial task in the field of document image enhancement. However, existing methods tend to remove shadows only against constant-color backgrounds and ignore color shadows. In this paper, we first design a diffusion model in latent space for document image shadow removal, called DocShaDiffusion. It translates shadow images from pixel space to latent space, enabling the model to more easily capture essential features. To address the issue of color shadows, we design a shadow soft-mask generation module (SSGM). It is able to produce accurate shadow masks and add noise into shadow regions specifically. Guided by the shadow mask, a shadow mask-aware guided diffusion module (SMGDM) is proposed to remove shadows from document images by supervising the diffusion and denoising process. We also propose a shadow-robust perceptual feature loss to preserve details and structures in document images. Moreover, we develop a large-scale synthetic document color shadow removal dataset (SDCSRD). It simulates the distribution of realistic color shadows and provides strong support for the training of models. Experiments on three public datasets validate the proposed method's superiority over the state of the art. Our code and dataset will be publicly available.
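The SSGM is a learned module whose internals are not described in this abstract. As a crude hand-crafted stand-in that conveys what a "soft mask" over a grayscale document image means, one could ramp the mask with darkness relative to the estimated paper background; the percentile heuristic and all names below are assumptions, not the paper's method.

```python
import numpy as np

def soft_shadow_mask(gray, bg_percentile=90):
    """Crude luminance-based soft shadow mask (a stand-in for the learned
    SSGM): estimate the bright paper background from a high percentile,
    then map darker pixels to mask values in [0, 1] with a soft ramp
    instead of a hard threshold. gray is in [0, 1], shape (H, W)."""
    bg = np.percentile(gray, bg_percentile)            # bright background level
    mask = np.clip((bg - gray) / max(bg, 1e-6), 0.0, 1.0)
    return mask                                        # 1 = strong shadow, 0 = lit paper
```

A soft mask like this lets later stages inject noise (or apply guidance) proportionally inside shadow regions rather than with a binary cutoff.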


Latent Feature-Guided Diffusion Models for Shadow Removal

Mei, Kangfu, Figueroa, Luis, Lin, Zhe, Ding, Zhihong, Cohen, Scott, Patel, Vishal M.

arXiv.org Artificial Intelligence

Recovering textures under shadows has remained a challenging problem due to the difficulty of inferring shadow-free scenes from shadow images. In this paper, we propose the use of diffusion models as they offer a promising approach to gradually refine the details of shadow regions during the diffusion process. Our method improves this process by conditioning on a learned latent feature space that inherits the characteristics of shadow-free images, thus avoiding the limitation of conventional methods that condition on degraded images only. Additionally, we propose to alleviate potential local optima during training by fusing noise features with the diffusion network. We demonstrate the effectiveness of our approach, which outperforms

Motivated by the success of diffusion-based image restoration models [38, 41], we adapt diffusion models for the task of shadow removal by conditioning on the input shadow image and corresponding shadow mask as a baseline approach to generate shadow-free images. However, preserving and generating high-fidelity textures and colors in the shadow region after removal is non-trivial. The baseline model appears to favor borrowing textures from the surrounding non-shadow areas rather than focusing on restoring the original details underneath the shadow, which results in incorrect color mixtures and loss of detail in the shadow region. In Figure 1, we show one of the representative issues of image-mask conditioning, i.e., the model synthesizes results containing an incorrect color mixture.
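For reference, the image-mask conditioning baseline described in the text is commonly realized as channel concatenation of the noisy sample, the shadow image, and its mask before each denoising step. The sketch below assumes (C, H, W) layouts and is an illustration of that generic pattern, not the paper's code.

```python
import numpy as np

def build_condition(x_t, shadow_img, mask):
    """Baseline image-mask conditioning: stack the current noisy sample,
    the input shadow image, and its shadow mask along the channel axis so
    the denoiser sees all three at every step.
    Shapes assumed: x_t (C, H, W), shadow_img (C, H, W), mask (1, H, W)."""
    return np.concatenate([x_t, shadow_img, mask], axis=0)
```

The paper's point is precisely that this conditioning alone is insufficient, motivating the learned latent feature space as an additional condition.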